0. Introduction and Summary


In the classical theory of competing risks [cf. the excellent
monograph of Birnbaum (1979)] it is assumed that (a) the risks, i.e., the
random variables of interest, are independent and (b) death does not result
from simultaneous causes. The classical estimator for the marginal
distributions of interest in the competing risks problem is that of Kaplan
and Meier (1958) or generalizations thereof [cf. Peterson (1975, 1977)].

Langberg, Proschan, and Quinzi (1981) [hereafter referred to as LPQ (1981)]
obtain strongly consistent estimators for the unobservable marginal
distributions of interest when assumptions (a) and (b) above fail to hold.
These estimators resemble those of Kaplan and Meier (1958). In Section 1, we
examine the competing risks problem in the presence of dependent risks
and state a number of known results. In Section 2, we establish the
asymptotic normality of the LPQ (1981) estimators. Only an outline of the
proof is given. This preliminary report represents work currently in
progress.
1. The competing risks model

Let there be a finite number of causes of death labelled $1, \ldots, r$.
We associate with each cause $j$ a nonnegative random variable $T_j$,
$j = 1, \ldots, r$. The random variable $T_j$ represents the age at death if
cause $j$ were the only cause present in the environment. The complete
collection of random variables $T_1, \ldots, T_r$ is not observed. Instead,
only two quantities are observed: the age at death given by
$T = \min(T_1, \ldots, T_r)$ and the cause of death, labelled $\xi$, given by
$\xi(T) = I$, $I \in \mathcal{I}$, where $\mathcal{I}$ represents the
collection of nonempty subsets of $\{1, \ldots, r\}$. Thus, $\xi(T) = I$ if
and only if $T = T_i$ for each $i \in I$ and $T \neq T_i$ for each
$i \notin I$. When death results from exactly one of the $r$ possible causes,
as is usually assumed, then $\xi$ is the index $i$ for which $T = T_i$. The
biomedical researcher is interested in making inferences about the
unobservable random variables $T_1, \ldots, T_r$ by using information from the
observable quantities, namely the life length $T$ and cause of death $\xi$.
In particular, he seeks to estimate the $2^r - 1$ survival probabilities

$M_J(t) = P[\min(T_j,\ j \in J) > t]$,  $J \in \mathcal{I}$.

We use the following notation throughout. If $T$ is a nonnegative random
variable with distribution function $F$, then $\bar F = 1 - F$.
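
The observation scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `observe` and the convention of passing the latent lifetimes as a list are assumptions made here, not part of the report. Given the (normally unobservable) lifetimes, it returns the age at death and the set of causes attaining the minimum, so that simultaneous causes are represented by a set with more than one element.

```python
from typing import FrozenSet, Sequence, Tuple


def observe(latent: Sequence[float]) -> Tuple[float, FrozenSet[int]]:
    """Return (T, xi): the age at death and the set of causes attaining it.

    latent[j - 1] plays the role of T_j; causes are labelled 1, ..., r.
    """
    t = min(latent)
    # xi(T) = I collects every cause whose latent lifetime equals the minimum.
    causes = frozenset(j for j, tj in enumerate(latent, start=1) if tj == t)
    return t, causes


# A death from cause 2 alone, then a simultaneous death from causes 1 and 3.
print(observe([4.0, 2.5, 7.0]))
print(observe([1.5, 3.0, 1.5]))
```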

LPQ (1978) prove the following

Theorem 1.1. Let $T = \min(T_1, \ldots, T_r)$, where $T_1, \ldots, T_r$ are
nonnegative random variables. Define $\bar F(t, I) = P(T > t,\ \xi(T) = I)$,
$F(t, I) = P(T \le t,\ \xi(T) = I)$, $I \in \mathcal{I}$,
$F(t) = \sum_{I \in \mathcal{I}} F(t, I)$ and $\bar F(t) = 1 - F(t)$.


Then the following statements hold:

(i) A necessary and sufficient condition for the existence of a set of
independent random variables $\{H_I,\ I \in \mathcal{I}\}$ which satisfy

$P(T > t,\ \xi(T) = I) = P[\min(H_J,\ J \in \mathcal{I}) > t,\ H_I < H_J \text{ for each } J \neq I]$

is that the functions $F(\cdot, I)$, $I \in \mathcal{I}$, have no common
discontinuities in the interval $[0, \alpha(F))$, where
$\alpha(F) = \sup\{t : 1 - F(t) > 0\}$.

(ii) The random variables $\{H_I,\ I \in \mathcal{I}\}$ in (i) have
corresponding survival probabilities $\{\bar G_I(\cdot),\ I \in \mathcal{I}\}$,
$\bar G_I(t) = P(H_I > t)$, which are uniquely defined on the interval
$[0, \alpha(F))$ as follows:

(1.1)  $\bar G_I(t) = \prod_{a \le t} [\bar F(a)/\bar F(a^-)] \cdot \exp\left[-\int_0^t dF^c(\cdot, I)/\bar F\right]$,  $0 \le t < \alpha(F)$,

where $F^c(\cdot, I)$ is the continuous part of $F(\cdot, I)$, the product is
over the discontinuities $\{a\}$ of $F(\cdot, I)$, $I \in \mathcal{I}$, and
the product over an empty set is defined as unity.

Remark 1.2. Although motivated by the competing risks model, Theorem 1.1
applies to any model where observations include (1) the time at which a
particular event occurs and (2) the identity of the causes (among a finite
number) which result in the occurrence of the event. For example, suppose
a personnel study is undertaken to study the departure patterns of employees
in a large company. The data on each employee might consist of (1) length
of stay, i.e., the time from arrival to termination, and (2) the reason for
termination. Here, each employee terminates (dies) for one or more of
several reasons (causes).
Remark 1.3. Formula (1.1) represents each distribution in the independent
collection $\{H_I,\ I \in \mathcal{I}\}$ as a function of the (observable)
cause-specific subdistribution functions $F(\cdot, I)$, $I \in \mathcal{I}$,
as well as the (observable) survival function $\bar F(t) = P(T > t)$. It is
this representation of distributions in the independent collection by
observable functions which plays a key role in the estimation problem.

Let $\mathbf{T}_i = (T_{1i}, \ldots, T_{ri})$, $i = 1, \ldots, n$, represent a
random sample from the joint distribution of the nonnegative random variables
$T_1, \ldots, T_r$. For each $J \in \mathcal{I}$, let
$M_J(t) = P[\min(T_j,\ j \in J) > t]$. For $j \in \{1, \ldots, r\}$, we write
$M_j(t)$ instead of $M_{\{j\}}(t)$. For each $i = 1, \ldots, n$, only $T_i$
and $\xi_i$ are observed, where $T_i = \min(T_{1i}, \ldots, T_{ri})$ and
$\xi_i = J$ whenever $T_i = T_{ji}$ for each $j \in J$ and $T_i \neq T_{ji}$
for each $j \notin J$. It is important to note that we have not made either
of the two classical assumptions, namely (a) the risks, i.e., the random
variables $T_1, \ldots, T_r$, are independent; and (b) death does not result
from simultaneous causes, i.e., $P(T_i = T_j) = 0$ for $i \neq j$ (here $T_i$
and $T_j$ denote the latent lifetimes for causes $i$ and $j$). If (a) and (b)
hold, the function $M_J(t)$ may be estimated (consistently) [cf. Peterson
(1975)] by using a generalized version of the Kaplan-Meier (1958)
(product-limit) estimator

(1.2)  $\hat M_J(t) = \prod_i [(n - i)/(n - i + 1)]$,

where the product is over the ranks $i$ of those ordered observations
$T_{(i)}$ such that $T_{(i)} \le t \le T_{(n)}$ and $T_{(i)}$ corresponds to a
death from at least one cause $j \in J$. If $T_{(n)}$ corresponds to a death
from a cause $j \in J$, then $\hat M_J(t)$ is defined to be zero for
$t > T_{(n)}$. Otherwise, $\hat M_J(t)$ is undefined for $t > T_{(n)}$. [In
the original formulation by Kaplan and Meier (1958), $r = 2$ and $T_1$
corresponded to the time until death, while $T_2$ corresponded to the time at
which a loss occurred.]
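
As a rough illustration of the product-limit rule (1.2), the following Python sketch computes the estimate from a list of observed pairs (age at death, set of causes). The name `km_estimate`, the data layout, and the use of `None` for "undefined" are assumptions made here for illustration; the sketch also assumes that no two observed times are tied.

```python
from fractions import Fraction
from typing import FrozenSet, List, Optional, Set, Tuple

Observation = Tuple[float, FrozenSet[int]]  # (age at death, set of causes)


def km_estimate(data: List[Observation], J: Set[int], t: float) -> Optional[Fraction]:
    """Product-limit estimate of M_J(t) following (1.2); assumes no tied times."""
    ordered = sorted(data, key=lambda obs: obs[0])
    n = len(ordered)
    last_time, last_causes = ordered[-1]
    if t > last_time and not (last_causes & J):
        return None  # undefined beyond T_(n) when the last death is not from J
    estimate = Fraction(1)
    for rank, (time, causes) in enumerate(ordered, start=1):
        # Only deaths involving at least one cause in J contribute a factor.
        if time <= t and causes & J:
            estimate *= Fraction(n - rank, n - rank + 1)
    return estimate


sample = [(1.0, frozenset({1})), (2.0, frozenset({2})),
          (3.0, frozenset({1})), (4.0, frozenset({1, 2}))]
print(km_estimate(sample, {1}, 3.0))
```

Deaths from causes outside $J$ act as losses: they shrink the risk set through the ranks but contribute no factor of their own.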

Suppose now that it is not assumed that $T_1, \ldots, T_r$ are independent.
LPQ (1981) prove the following

Theorem 1.4. Let $T_1, \ldots, T_r$ be nonnegative random variables such that
the functions $F(t, I) = P(T \le t,\ \xi(T) = I)$, $I \in \mathcal{I}$, have
no common discontinuities. Define
$\mathcal{I}_I = \{J \in \mathcal{I} : J \cap I \neq \emptyset\}$. Fix
$I \in \mathcal{I}$. Then for each $t \in [0, \alpha(F)]$,

(1.3)  $M_I(t) = \prod_{J \in \mathcal{I}_I} \bar G_J(t)$

if and only if

(1.4a)  $M_I(a)/M_I(a^-) = \begin{cases} \bar F(a)/\bar F(a^-), & a \in D(\mathcal{I}_I) \\ 1, & \text{otherwise;} \end{cases}$

and

(1.4b)  $P(T_{I'} > t \mid T_I = t) = P(T_{I'} > t \mid T_I > t)$,

where $\bar G_J$ is given by (1.1), $D(\mathcal{I}_I)$ is the set of
discontinuities of the function
$F(t, \mathcal{I}_I) = P(T \le t,\ \xi(T) \in \mathcal{I}_I)$,
$T_I = \min(T_i,\ i \in I)$ and $I'$ is the complement of $I$ in
$\{1, \ldots, r\}$.
Remark 1.5. By finding consistent estimators for the functions $\bar G_J$ in
(1.3), LPQ (1981) establish that (1.4a) and (1.4b) are necessary and
sufficient conditions on the joint distribution of $T_1, \ldots, T_r$ for the
existence of a consistent estimator of $M_I$ in (1.3).

Remark 1.6. Suppose that the random variables $T_I = \min(T_i,\ i \in I)$,
$I \in \mathcal{I}$, have absolutely continuous distributions. Let $g_I(t)$
[respectively, $M_I(t)$] and $g_{I|I'}(t)$ [respectively, $M_{I|I'}(t)$]
denote the density (respectively, survival) function of $T_I$ and the
conditional density (respectively, conditional survival) function of $T_I$
given $T_{I'} > t$. Then condition (1.4b) is equivalent to

$g_{I|I'}(t)/M_{I|I'}(t) = g_I(t)/M_I(t)$.

In other words, the conditional failure rate function of $T_I$ given
$T_{I'} > t$ is equal to the (unconditional) failure rate function of $T_I$.
Stated differently, the random variables $T_I$ and $T_{I'}$ are independent
"along the diagonal $T_I = T_{I'}$". This property of "diagonal independence"
is of importance in the case of dependent competing risks and is presently
being studied by the authors. Desu and Narula (1977) arrive at a condition
similar to (1.4b) in the special case when $T_1, \ldots, T_r$ have a joint
distribution which is absolutely continuous.

Suppose now that the functions $F(t, I)$, $I \in \mathcal{I}$, have no common
discontinuities. We make no assumption as to the independence of
$T_1, \ldots, T_r$. In view of (1.3), a natural estimator for $M_I$ is

(1.5)  $\hat M_I(t) = \prod_{J \in \mathcal{I}_I} \hat G_{n,J}(t)$,

where $\hat G_{n,J}(t)$ is obtained from the right side of (1.1), with $J$ in
place of $I$, by replacing $F(\cdot, J)$ and $\bar F$ by their empirical
counterparts

$F_n(t, J) = n^{-1} \sum_{i=1}^{n} \chi_{\{T_i \le t,\ \xi_i = J\}}$

and

$\bar F_n(t) = n^{-1} \sum_{i=1}^{n} \chi_{\{T_i > t\}}$,

where $\chi_A$ is the indicator function of the set $A$. LPQ (1981) show that
in this case, (1.5) is a (strongly) consistent estimator for $M_I$ when
(1.4a,b) hold.
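
A minimal Python sketch of estimator (1.5) follows. It relies on the fact that the empirical subdistribution functions are pure step functions, so the exponential factor in (1.1) equals one and only the product over observed jump times remains. The names `g_hat` and `m_hat` and the data layout are assumptions for illustration, and the sketch assumes that no observed time is shared by two different cause sets (the sample analogue of "no common discontinuities").

```python
from fractions import Fraction
from typing import FrozenSet, List, Set, Tuple

Observation = Tuple[float, FrozenSet[int]]  # (age at death, set of causes)


def g_hat(data: List[Observation], J: FrozenSet[int], t: float) -> Fraction:
    """Empirical version of (1.1) for the cause set J."""
    result = Fraction(1)
    jump_times = sorted({time for time, causes in data if causes == J and time <= t})
    for a in jump_times:
        at_risk = sum(1 for time, _ in data if time >= a)   # n * Fbar_n(a-)
        survivors = sum(1 for time, _ in data if time > a)  # n * Fbar_n(a)
        result *= Fraction(survivors, at_risk)
    return result


def m_hat(data: List[Observation], I: Set[int], t: float) -> Fraction:
    """Estimator (1.5): product of g_hat over the cause sets J that meet I."""
    result = Fraction(1)
    # Cause sets never observed contribute an empty product, i.e. a factor 1.
    for J in {causes for _, causes in data}:
        if J & I:
            result *= g_hat(data, J, t)
    return result


sample = [(1.0, frozenset({1})), (2.0, frozenset({2})),
          (3.0, frozenset({1})), (4.0, frozenset({1, 2}))]
print(m_hat(sample, {1}, 3.0), m_hat(sample, {1, 2}, 3.0))
```

When $I = \{1, \ldots, r\}$ every observed cause set meets $I$, so in this sketch the product telescopes to the empirical survival function $\bar F_n(t)$.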

Remark 1.7. If $T_1, \ldots, T_r$ are independent and $P(T_i = T_j) = 0$ for
$i \neq j$, then (1.5) reduces to the usual Kaplan-Meier (1958) estimator
(1.2) or a version thereof.

Remark 1.8. Suppose for a moment that we make no assumption on the underlying
distribution of $T_1, \ldots, T_r$ except that the functions $F(t, I)$,
$I \in \mathcal{I}$, have no common discontinuities. Let
$0 \equiv T_{(0)} \le T_{(1)} \le \cdots \le T_{(n)} < T_{(n+1)} \equiv \infty$
denote the ordered values of the times $T_1, \ldots, T_n$ at which deaths
occur. We do not exclude the possibility of multiple deaths at $T_{(j)}$. We
thus obtain the (possibly degenerate) intervals $[0, T_{(1)})$,
$[T_{(1)}, T_{(2)})$, ..., $[T_{(n)}, \infty)$ such that the number of deaths
in any interval is exactly one.


For each interval $[T_{(j)}, T_{(j+1)})$, estimate the proportion $p_j$ of
individuals alive just after $T_{(j-1)}$ that survive the interval as follows:
let $N(t)$ = the number of individuals observed and surviving at $t$ when
deaths due to cause $I$ (but not deaths due to any other cause) at $t$ itself
are subtracted off; and $\delta_j = N(T_{(j)}^-) - N(T_{(j)})$ = the number of
deaths at $T_{(j)}$.

Then the estimate of $p_j$ above is

$\hat p_j = N(T_{(j)})/N(T_{(j)}^-)$.


Now, to estimate the probability of surviving until $t$ if cause $I$ were
the only risk present in the environment, Kaplan and Meier (1958) calculate

(1.6)  $M_I^*(t) = \prod_{j \ge 1:\ T_{(j)} \le t} \hat p_j$.

For any given set of data, formula (1.6) yields the same numerical
estimate as formula (1.5). Recall, however, that (1.5) is a consistent
estimator of $M_I$ if and only if (1.4a,b) hold. Yet, even in the face of
ignorance about the truth or falsity of (1.4a,b), we know precisely what
parameter of the underlying distribution is being estimated (consistently)
by (1.6), namely

(1.7)  $\prod_{J \in \mathcal{I}_I} \bar G_J(t)$,

where $\bar G_J$ is given by (1.1). To the authors' knowledge, this fact has
never been pointed out.
2. The main result

In this section we outline a proof of the fact that, viewed as a
process in $t$, the estimator (1.5) converges to a Gaussian process. As a
result, we extend a result of Breslow and Crowley (1974) from the case of
a continuous survival function to an arbitrary survival function. For
simplicity, we assume here that the distribution $F$ of $T$ has finitely many
discontinuities. The case of a countable infinity of discontinuities will be
presented in a subsequent report.

We inquire into the asymptotic distribution of

$\hat G_{n,J}(t) = \prod_{a \le t} [\bar F_n(a)/\bar F_n(a^-)]$,

where the last product is over the set of observations $\{a\}$ such that
$T_i = a$ and $\xi_i = J$, $i = 1, \ldots, n$. Let this set $\{a\}$ of points
be denoted by $D(n, J)$ and let $C(J)$ [$D(J)$] be the set of continuities
(discontinuities) of the function $F(t, J)$. Then we can write

(2.1)  $\sqrt{n}\,[\hat G_{n,J}(t) - \bar G_J(t)] = \sqrt{n}\,[e^{H_{n,J}(t)} - e^{H_J(t)}]$
       $= \sqrt{n}\,[H_{n,J}(t) - H_J(t)]\,e^{H_J(t)} + \tfrac{1}{2}\sqrt{n}\,[H_{n,J}(t) - H_J(t)]^2\,e^{H^*(t)}$,

where

$H_{n,J}(t) = \sum_{a \le t} \ln[\bar F_n(a)/\bar F_n(a^-)]\,\chi_{D(n,J)}(a)$,

$H_J(t) = \sum_{a \le t} \ln[\bar F(a)/\bar F(a^-)]\,\chi_{D(J)}(a) - \int_0^t dF^c(\cdot, J)/\bar F$,

and the function $H^*(t)$ is between $H_{n,J}(t)$ and $H_J(t)$.
We consider first the asymptotic distribution of
$\sqrt{n}\,[H_{n,J}(t) - H_J(t)]$. We have

$\sqrt{n}\,[H_{n,J}(t) - H_J(t)] = \sqrt{n}\,[A_{n,J}(t) - A_J(t)] + \sqrt{n}\,[B_{n,J}(t) - B_J(t)]$,

where

$A_{n,J}(t) = \sum_{a \le t} \ln[\bar F_n(a)/\bar F_n(a^-)]\,\chi_{D(n,J)}(a)\,\chi_{D(J)}(a)$,

$A_J(t) = \sum_{a \le t} \ln[\bar F(a)/\bar F(a^-)]\,\chi_{D(J)}(a)$,

$B_{n,J}(t) = \sum_{a \le t} \ln[\bar F_n(a)/\bar F_n(a^-)]\,\chi_{D(n,J)}(a)\,\chi_{C(J)}(a)$, and

$B_J(t) = -\int_0^t dF^c(\cdot, J)/\bar F$.

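We can now state

Theorem 2.1. Assume that each function $F(\cdot, J)$, $J \in \mathcal{I}$, has
finitely many discontinuities. Fix $J \in \mathcal{I}$ and let
$0 < a_1 < \cdots < a_k < \infty$ denote the discontinuities of $F(t, J)$.
Then the $k$-dimensional random vector whose $i$th component is

$\sqrt{n} \sum_{j=1}^{i} \{\ln[\bar F_n(a_j)/\bar F_n(a_j^-)] - \ln[\bar F(a_j)/\bar F(a_j^-)]\}$

converges in distribution to a $k$-dimensional multivariate normal with mean
vector $0$ and covariance matrix $\Sigma = (\sigma_{ij})$, which can be
represented thus:

$$
\Sigma = \begin{bmatrix}
b_1 & b_1 & b_1 & \cdots & b_1 \\
b_1 & b_1 + b_2 & b_1 + b_2 & \cdots & b_1 + b_2 \\
b_1 & b_1 + b_2 & b_1 + b_2 + b_3 & \cdots & b_1 + b_2 + b_3 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
b_1 & b_1 + b_2 & b_1 + b_2 + b_3 & \cdots & b_1 + \cdots + b_k
\end{bmatrix},
$$

where $b_i = [F(a_i)/\bar F(a_i)] - [F(a_i^-)/\bar F(a_i^-)]$,
$i = 1, \ldots, k$.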
The proof of Theorem 2.1 is straightforward and is omitted.
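
The covariance matrix of Theorem 2.1 has entries $\sigma_{ij} = b_1 + \cdots + b_{\min(i,j)}$, which is the covariance structure of partial sums of uncorrelated terms with variances $b_i$. The Python sketch below builds it from the survival values at the jump points. It assumes the reading $b_i = F(a_i)/\bar F(a_i) - F(a_i^-)/\bar F(a_i^-)$, equivalently $1/\bar F(a_i) - 1/\bar F(a_i^-)$, and that $\bar F(a_i) > 0$; the function names and input layout are illustrative only.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def jump_variances(surv: Sequence[Tuple[Fraction, Fraction]]) -> List[Fraction]:
    """Compute b_i from pairs (Fbar(a_i-), Fbar(a_i)); requires Fbar(a_i) > 0."""
    # F/Fbar = (1 - Fbar)/Fbar, evaluated at a_i and just before a_i.
    return [(1 - at) / at - (1 - left) / left for left, at in surv]


def covariance_matrix(b: Sequence[Fraction]) -> List[List[Fraction]]:
    """sigma_ij = b_1 + ... + b_min(i, j)."""
    k = len(b)
    partial = [sum(b[: m + 1], Fraction(0)) for m in range(k)]
    return [[partial[min(i, j)] for j in range(k)] for i in range(k)]


# Two jumps: the survival function drops from 1 to 1/2, then from 1/2 to 1/4.
b = jump_variances([(Fraction(1), Fraction(1, 2)), (Fraction(1, 2), Fraction(1, 4))])
print(b)
print(covariance_matrix(b))
```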

Now define a process $Z_{J,1}(t)$ in $D = D[0, \alpha(F)]$ whose
finite-dimensional distributions are multivariate normal with
$E Z_{J,1}(t) = 0$ and

$\mathrm{Cov}[Z_{J,1}(s), Z_{J,1}(t)] = \begin{cases} \sigma_{ij} & \text{for } s \in [a_i, a_{i+1}),\ t \in [a_j, a_{j+1}),\ s \le t \\ \sigma_{ji} & \text{for } s > t; \end{cases}$

where $\sigma_{ij}$ is the $(i,j)$th entry of $\Sigma$ in Theorem 2.1 and
$D[0, \alpha(F)]$ is the space of functions on $[0, \alpha(F)]$ that are
right-continuous and have left-hand limits. Such a process exists by
Theorem 15.3 of Billingsley (1968).

Theorem 2.2. The process $\sqrt{n}\,[A_{n,J}(t) - A_J(t)]$ converges weakly to
$Z_{J,1}(t)$ as $n \to \infty$.

Proof. Note that
$\sqrt{n}\,[A_{n,J}(t) - A_J(t)] = \sqrt{n}\,[A_{n,J}(a_i) - A_J(a_i)]$
for $t \in [a_i, a_{i+1})$. For $\delta < \min_{1 \le i < j \le k} |a_i - a_j|$,
it is easily seen that

$w''_{x_n}(\delta) = \sup \min(|x_n(t) - x_n(t_1)|,\ |x_n(t_2) - x_n(t)|) = 0$,

where $x_n(t) = \sqrt{n}\,[A_{n,J}(t) - A_J(t)]$ and the supremum extends over
$t, t_1, t_2$ such that $t_1 \le t \le t_2$ and $t_2 - t_1 \le \delta$. The
theorem follows from Theorem 2.1 above and Theorem 15.4 of Billingsley
(1968). ∎
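
The key step in the proof is that a step function whose jumps are more than $\delta$ apart has modulus $w''(\delta) = 0$: a window of length at most $\delta$ cannot contain two jumps, so one of the two increments in the minimum always vanishes. The Python sketch below checks this on a finite grid. It is only a grid approximation of the supremum, and the names `step` and `modulus` as well as the example jump sizes are assumptions for illustration.

```python
from fractions import Fraction
from typing import Dict, List


def step(t: Fraction, jumps: Dict[Fraction, int]) -> int:
    """Value at t of the right-continuous step function with the given jumps."""
    return sum(size for a, size in jumps.items() if a <= t)


def modulus(jumps: Dict[Fraction, int], grid: List[Fraction], delta: Fraction) -> int:
    """Grid approximation of sup min(|x(t) - x(t1)|, |x(t2) - x(t)|)."""
    worst = 0
    for t1 in grid:
        for t2 in grid:
            if t1 <= t2 and t2 - t1 <= delta:
                for t in grid:
                    if t1 <= t <= t2:
                        left = abs(step(t, jumps) - step(t1, jumps))
                        right = abs(step(t2, jumps) - step(t, jumps))
                        worst = max(worst, min(left, right))
    return worst


grid = [Fraction(i, 10) for i in range(31)]   # 0, 0.1, ..., 3.0
jumps = {Fraction(1): 2, Fraction(2): -1}     # jumps at a_1 = 1 and a_2 = 2
print(modulus(jumps, grid, Fraction(1, 2)))   # window shorter than the gap
print(modulus(jumps, grid, Fraction(3, 2)))   # window can straddle both jumps
```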


Breslow and Crowley (1974) show that the pair
$(X_n, Y_n) \in D[0, \alpha(F)] \times D[0, \alpha(F)]$ defined by
$X_n = \sqrt{n}\,(F_n - F)$ and
$Y_n(t) = \sqrt{n}\,\left[\sum_{a \le t} \chi_{C(J)}(a)\,\chi_{D(n,J)}(a)/n - F^c(t, J)\right]$
converges weakly to a bivariate Gaussian process $(X, Y)$ which has
mean vector zero and a covariance structure given, for $s \le t$, by

(2.2)
$\mathrm{Cov}(X(s), X(t)) = F(s)\,\bar F(t)$,
$\mathrm{Cov}(Y(s), Y(t)) = F^c(s, J)\,[1 - F^c(t, J)]$,
$\mathrm{Cov}(Y(s), X(t)) = F^c(s, J)\,\bar F(t)$,
$\mathrm{Cov}(X(s), Y(t)) = F^c(s, J) - F(s)\,F^c(t, J)$.


Thus, by Theorem 4 of Breslow and Crowley (1974), the process
$\sqrt{n}\,[B_{n,J}(t) - B_J(t)]$ converges weakly to the Gaussian process
$Z_{J,2}(t)$ defined by

$Z_{J,2}(t) = \int_0^t (X/\bar F^2)\,dF^c(\cdot, J) + [Y(t)/\bar F(t)] - \int_0^t Y\,d(1/\bar F)$,

where $(X, Y)$ is the bivariate mean zero Gaussian process satisfying (2.2).
Furthermore, the covariance structure of the limiting process $Z_{J,2}(t)$
can be obtained in a manner similar to that in Breslow and Crowley
(1974).

Combining this result with Theorem 2.2 above, we have

Theorem 2.3. The process $\sqrt{n}\,[H_{n,J}(t) - H_J(t)]$ converges weakly to
the Gaussian process $Z_{J,1}(t) + Z_{J,2}(t) = Z_J(t)$.

Remark  2.4.  The  covariance  structure  of  the  limiting  process  in  Theorem 
2.3,  as  well  as  in  the  remaining  theorems,  may  be  obtained  in  a  tedious 
but  straightforward  manner.  The  exact  derivations  are  given  in  a  later 
report. 

Consider now (2.1). Since $\sqrt{n}\,[H_{n,J}(t) - H_J(t)]$ converges weakly,
the second term in (2.1) converges to 0 in probability. Thus, we have

Theorem 2.5. The process $\sqrt{n}\,[\hat G_{n,J}(t) - \bar G_J(t)]$ converges
weakly to the Gaussian process $Z_J(t) \cdot \bar G_J(t)$, where $Z_J(t)$ is
the limiting process in Theorem 2.3.
Remark 2.6. Finally, by an application of the so-called $\delta$-method
[cf. Rao (1973)], we see that the estimator $\hat M_I(t)$ given by (1.5),
suitably centered and scaled by $\sqrt{n}$, also converges weakly to a
Gaussian process.